1  Introduction


The  pervasiveness  of  security  vulnerabilities  in  commercial  off-the-shelf  (COTS) 
computer  systems  has  prompted  much  research  in  building  survivable  (or  intrusion 
tolerant)  distributed  services.  In  this  approach,  COTS  systems  are  composed  to 
implement  a  distributed  service  that  is  robust  to  successful  attacks  on  these  individual 
components.  The  research  literature  has  documented  significant  strides  in  the 
development  of  such  services  (e.g.,  [21]  [22] [4]  [12]),  and  we  ourselves  have  constructed 
software  to  implement  such  distributed  services  in  a  previous  DARPA  program,  namely 
the  Fleet  survivable  object  store  [16].  Despite  the  successes  described  in  the  research 
literature,  significant  obstacles  remain  to  the  deployment  of  this  approach  on  a  wide 
scale. 

The  high-level  goal  of  this  research  program  was  to  address  what  we  perceive  as  the  most 
challenging  obstacle,  namely  vulnerability  to  client  compromise.  As  a  simple  example, 
consider  a  distributed  service  that  implements  the  abstraction  of  shared  files,  and  does  so 
“survivably”  in  that  it  masks  the  corruption  of  individual  file  servers  from  clients. 

Despite  this,  if  a  client  with  authority  to  write  to  a  file  is  compromised,  then  this  client 
can  arbitrarily  overwrite  the  file,  effectively  corrupting  every  server  and  rendering  the  file 
useless  to  the  application  that  requires  it. 

Our  goal  in  this  research  was  therefore  to  extend  the  Fleet  system  to  include  defense 
against  corrupt  clients.  Our  efforts  focused  on  clients  that  are  driven  by  a  human  user, 
and  that  should  be  disabled  if  the  client  device  falls  out  of  physical  possession  of  that 
user.  This  is  a  category  of  client  that  is  only  becoming  more  important,  especially  with 
the  widespread  deployment  of  mobile  devices  such  as  programmable  mobile  phones  and 
PDAs,  and  with  the  anticipated  deployment  of  wearable  computing  devices.  Indeed,  if 
our  experience  with  laptop  computers  and  mobile  phones  is  any  indication,  then  these 
devices  will  be  stolen  frequently.  And,  the  importance  of  defending  against  captured 
wearable  computers  in  battlefield  situations  should  be  obvious. 

To defend against captured clients, we applied techniques we have developed that permit
the  client  device  to  perform  cryptographic  operations  (e.g.,  to  digitally  sign  a  request  to 
modify  an  object)  only  after  the  device  has  convinced  a  remote  server,  here  called  a 
capture-protection  server,  that  the  device  is  still  in  the  possession  of  the  correct  user 
[14]  [15].  Our  techniques  are  particularly  powerful  in  that  the  capture-protection  server  is 
untrusted;  even  if  compromised,  it  does  not  pose  a  threat  to  the  cryptographic  keys  of  the 
device  (unless  the  device  is  also  captured).  These  techniques  are  also  powerful  in  that 
they permit a device to delegate from one capture-protection server to another, so that
subsequently the second server is authorized to perform the capture-protection function
for  that  device  [15].  Using  delegation,  the  device  can  ensure  that  it  has  a  capture- 
protection  server  in  relatively  close  proximity  at  all  times,  so  as  to  minimize  the  latency 
of  interacting  with  the  server  in  the  course  of  performing  cryptographic  operations. 

The integration of these techniques with Fleet promises to significantly harden Fleet
against  this  important  class  of  threats.  At  the  same  time,  however,  it  offers  opportunities 
for  improving  our  capture-resilience  techniques.  This  potential  is  best  understood  by 
considering  the  specific  function  of  a  capture  protection  server,  i.e.,  to  confirm  that  the 
device  remains  in  the  possession  of  a  legitimate  user  before  permitting  it  to  perform  a 
cryptographic operation. Presuming a password is used to perform this confirmation (as
in [14] [15]), the server must limit the number of incorrect guesses against the device's
password, lest it permit an attacker who has captured the device to progress too far
in an online dictionary attack. When servers are dynamically authorized, however, this
may  widen  the  window  of  vulnerability  to  such  an  attack:  If  the  attacker  captures  the 
device,  then  it  can  mount  an  online  dictionary  attack  against  each  currently  authorized 
server,  thereby  gaining  more  password  guesses  than  any  one  server  allows.  A  second 
security  challenge  arises  from  the  feature  that  a  capture  protection  server  can  be  disabled 
for  a  device  if  the  device  is  captured,  even  if  the  attacker  has  compromised  the  user's 
password.  Delegation  also  poses  challenges  to  disabling:  If  the  device  and  password  are 
compromised,  and  if  there  is  some  authorized  server  when  this  happens,  then  the  attacker 
can delegate from this authorized server to any server permitted by the policy set forth
when  the  device  was  initialized.  Thus,  to  be  sure  that  the  device  can  never  use  its  key 
again,  every  server  in  this  “admissible”  set  must  be  disabled  for  the  device. 

A  proper  solution  to  these  problems  would  be  for  the  capture-protection  servers  to 
coordinate  among  themselves,  e.g.,  to  inform  each  other  of  the  incorrect  password 
guesses  that  have  been  made  by  a  device.  A  focus  in  this  document  is  the  design  of  such 
an  architecture  that  supports  secure  data  sharing  among  capture  protection  servers  in  a 
way  that  reverses  the  negative  effects  of  delegation.  As  a  result,  the  number  of  password 
guesses  permitted  against  a  captured  device  is  unaffected  by  the  number  of  servers 
authorized  for  the  device,  and  disabling  the  device  at  one  authorized  server  has  the  effect 
of  disabling  the  device  at  all  servers.  However,  with  these  benefits  come  significant 
costs to availability, in that the failure of any authorized capture-protection server can
indefinitely  prevent  the  device’s  use  of  its  cryptographic  keys. 

Fortunately, using the capabilities of Fleet to build highly survivable capture-protection
servers,  we  can  largely  eliminate  this  availability  concern  in  the  capture-protection 
infrastructure.  Due  to  its  implementation  using  Fleet,  each  capture-protection  server  will 
remain  available  despite  benign  and  even  malicious  faults  (hostile  penetrations)  of  a 
fraction  of  its  replicas,  by  virtue  of  the  Fleet  replication  and  replica  coordination 
protocols.  Fleet  thus  enables  the  adoption  of  this  coordination  architecture  in  critical 
applications,  while  simultaneously  benefiting  from  it. 

In  addition  to  improving  security,  our  capture-protection  architecture  exploits  “locality  of 
reference”  by  a  mobile  user,  in  two  respects.  First,  our  approach  imposes  communication 
overhead  only  when  the  device  switches  from  using  one  capture  protection  server  to 
using  another;  after  one  interaction  to  perform  a  cryptographic  operation  with  the  new 
server,  there  is  no  additional  communication  overhead  for  subsequent  operations. 

Second,  if  delegation  patterns  follow  a  user's  travels,  the  communication  overhead  of 
switching servers is typically incurred only between the new server and the previous one;
there  is  no  need  to  retrieve  data  from  a  distant  “home  location”  or  a  designated  server. 


2  Related  work 

Online  services  that  play  a  role  similar  (though  not  identical)  to  that  of  a  capture 
protection server include the modified Kerberos server of Yaksha [11], a semi-trusted
mediator [2], and a security mediator in server-aided signatures [10]. These servers are
interposed in the critical path of a user performing cryptographic operations using her
private  key,  and  thereby  can  disable  a  private  key  that  should  no  longer  be  used.  To  our 
knowledge,  no  prior  effort  other  than  that  from  which  we  build  [15]  has  proposed  a 
notion  of  delegation  from  one  server  to  another,  and  consequently  the  issues  that  we 
attempt  to  address  here  have  not  been  previously  considered  in  these  other  efforts. 

At  the  basis  of  our  capture  protection  infrastructure  for  coordinating  capture-protection 
servers  is  a  novel  protocol  for  achieving  mutually  exclusive  access  to  a  mobile  object. 
Our  protocol  was  inspired  by  prior  algorithms  for  similar  goals  (e.g., 
[19][18][4][8][1][20][25][3][6]),  but  at  the  same  time  differs  from  them  in  significant 
ways.  First,  we  assume  a  dynamic  network  topology  determined  by  delegation  patterns, 
whereas most prior work on distributed mutual exclusion for mobile objects (e.g.,
[19] [18] [8]) builds on static topologies. Second, we permit Byzantine node failures [13]
within  our  attacker  models  (a  requirement  for  survivable  systems),  while  most  prior 
efforts  in  fault  tolerant  mutual  exclusion  for  mobile  objects  deal  with  only  benign  node 
failures  (e.g.,  [4][1][20][25][3][6]). 

3  Background  in  capture  protection 

In  this  section  we  present  background  in  capture  protection,  and  then  develop  our 
coordination  protocols  for  capture  protection  servers  in  Sections  4-6.  To  simplify  the 
discussion  in  these  sections,  we  will  avoid  discussion  of  Fleet-specific  matters  and 
presume  that  capture  protection  servers  are  non-replicated.  We  will  return  to  the 
implementation  of  these  techniques  in  Fleet  (in  which  capture  protection  servers  are 
implemented  as  replicated  Fleet  objects)  in  Section  7. 

A  capture  protection  system  consists  of  a  device  dvc  and  an  arbitrary  number  of 
computers  called  nodes,  each  denoted  nd.  Each  node  can  host  (execute)  multiple  logical 
capture  protection  servers.  A  server  is  denoted  by  svr,  typically  with  additional 
subscripts  or  other  annotations.  In  our  system,  the  device  is  used  for  generating  digital 
signatures¹ (e.g., using RSA [23]), and does so by interacting with one of the servers over
a public network. The signature operation is protected by a password π. The system is
initialized with public data, secret data for the device, secret data for the user (i.e., π), and
secret data for each node. The public and secret data associated with a node are simply a
certified public/private key pair for the node, which is assumed to be established well
before the device is initialized. We denote the public key of a node nd by pk_nd, and its
private key by sk_nd. Each svr has a public key pk_svr that is simply the public key pk_nd of
the node nd executing svr.

¹ The device can also be used for decrypting messages; however, for simplicity we deal only with
signatures here, as this is the most pertinent in the context of Fleet.

The device-server protocol allows a device operated by a user who knows π and enters it
correctly  to  sign  a  message  with  respect  to  the  public  key  of  the  device,  after 
communicating  with  one  of  the  servers.  The  device  is  initialized  with  one  server 
available  to  it,  denoted  svr*  and  executing  on  node  nd*,  though  the  device  can  cause  a 
new  server  to  be  created  on  another  node  via  delegation.  For  dvc  to  deploy  a  new  svr  on 
a  node,  another  existing  server  svr'  must  consent  to  delegating  its  authority  to  that  node, 
after  verifying  that  the  creation  of  a  server  on  that  node  is  consistent  with  policy 
previously set forth by dvc (see below for details) and is being performed by dvc with
the  user's  password.  In  this  way,  delegation  is  a  protected  operation  just  as  signing  is.  The 
device  can  unilaterally  revoke  a  server  when  it  no  longer  intends  to  use  that  server.  A 
node  can  be  disabled  (for  a  device)  by  being  instructed  to  no  longer  respond  to  that 
device  or,  more  precisely,  to  requests  involving  the  device's  key. 

We do not specify here a policy that determines the nodes to which dvc can delegate,
called the admissible nodes, though we do assume that the public key of such a node
can  be  determined  reliably.  The  policy  that  defines  admissibility  is  user-tunable  and  must 
be  set  when  dvc  is  initialized.  An  example  policy  might  allow  delegation  to  any  node 
with  a  public  key  certified  by  a  given  certification  authority.  (Note  this  would  also  justify 
our  assumption  that  the  public  key  of  an  admissible  node  could  be  determined.)  For  such 
a  policy,  the  admissible  nodes  are  not  known  in  advance  and  can  change  over  time.  Our 
approach  is  specifically  designed  to  accommodate  such  flexibility. 
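
As a rough illustration of the example policy above, the sketch below treats admissibility as "the node's public key is certified by one of a fixed set of certification authorities". The `NodeCert` record, its fields, and `is_admissible` are hypothetical names introduced only for this example, and certificate signature verification is assumed to have been done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCert:
    """Hypothetical, already-verified certificate for a node's public key."""
    node_id: str
    public_key: str
    issuer: str  # name of the certification authority that signed it


def is_admissible(cert: NodeCert, trusted_cas: frozenset) -> bool:
    """A node is admissible if its key was certified by a trusted CA.

    The set of admissible nodes is never enumerated: any node that later
    obtains a certificate from a trusted CA becomes admissible.
    """
    return cert.issuer in trusted_cas


policy = frozenset({"CA-1"})
print(is_admissible(NodeCert("nd7", "pk-nd7", "CA-1"), policy))
print(is_admissible(NodeCert("nd9", "pk-nd9", "CA-X"), policy))
```

Because the check is evaluated per certificate, the policy is fixed at initialization while the set of nodes it admits can still change over time.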

Each  attacker  we  consider  controls  the  network;  i.e.,  the  attacker  controls  the  inputs  to  the 
device and every node, and observes the outputs. Moreover, an attacker can permanently
compromise certain resources. The resources that may be compromised by the attacker
are any of the nodes, dvc, and π. Compromising a resource reveals its entire contents
to the attacker and permits the attacker to impersonate it. The one restriction on
the attacker is that if he compromises dvc, then he does so after dvc initialization and
while dvc is in an inactive state (i.e., dvc is not presently executing a protocol on user
input) and the user does not subsequently provide input to the device. This decouples
the capture of dvc and π, and is consistent with our motivation that dvc is captured while
not  in  use  by  the  user  and,  once  captured,  is  unavailable  to  the  user. 

We  formalize  different  aspects  of  the  system  described  thus  far  as  a  collection  of 
operations. 

• dvc.delegate(svr, nd): dvc performs a delegation with server svr, using the
correct password π, to deploy a new server on nd.

• dvc.revoke(svr): dvc revokes svr, indicating it will not be using svr in the
future.

• nd.disable: nd stops responding to any requests from dvc (signing or
delegation).

• dvc.comp: dvc is compromised.

• nd.comp: nd is compromised.

• π.comp: the password π is compromised.

We  note  that  nd.comp  compromises  all  servers  ever  hosted  by  nd.  When  convenient,  we 
will  use  svr.comp  to  denote  nd.comp  where  nd  is  the  node  hosting  svr. 


4  Overview  of  algorithms 

Here  we  provide  only  the  essentials  of  how  the  delegation  and  signature  protocols  of  [15] 
work.  The  capture  protection  system  requires  a  device  initialization  phase,  for  which  the 
device dvc takes as input its private key sk_dvc, the password π, and the identity of node
nd* with public key pk_nd*. The output of initialization is a ticket τ* and an associated
authorization record authrec_τ* containing secret information for the device. τ* is a
ciphertext encrypted under pk_nd* by dvc, the plaintext for which is generated as a
function of π, a secret stored in authrec_τ*, sk_dvc, and an “inner ticket” ζ* that is itself a
ciphertext encrypted under pk_nd*.

Cryptographic operations by dvc require that dvc use a ticket τ and authorization record
authrec_τ to induce the creation of a logical server at the node nd able to decrypt τ for
processing requests bearing the ticket τ; we denote this server by svr_τ. (In particular,
svr* = svr_τ*.) nd initializes state for svr_τ, including a counter svr_τ.ctr ← 0 for counting
requests bearing τ but reflecting incorrect password guesses.


 1. svr_τ.doOperation(req)                /* can be invoked remotely (by some dvc) */
 2.   if (¬fromSameDvc(req, τ))
 3.     return ⊥                          /* return if τ and req are not from same device */
 4.   if (svr_τ.ctr = q_{svr_τ})          /* q_{svr_τ} is max # of allowed bad password guesses at svr_τ */
 5.     return ⊥                          /* return if already at max # of bad password guesses */
 6.   if (¬fromSameUser(req, τ))
 7.     svr_τ.ctr ← svr_τ.ctr + 1         /* record a bad password guess */
 8.     return ⊥                          /* return on bad password guess */
 9.   if (req.opType = "sign")
10.     return svr_τ.handleSignReq(req)   /* process sign request and return result */
11.   else
12.     return svr_τ.handleDelReq(req)    /* process delegation request and return result */

Figure 1: svr_τ.doOperation algorithm adopted from [15]


dvc can then interact with svr_τ to either sign messages or delegate. To do so, dvc
generates a request req as a function of π and the secret stored in authrec_τ, as well as the
request parameters: the message m to be signed if a signature operation, or the identity
and public key pk_nd' of nd' if delegating to nd'. dvc then invokes
svr_τ.doOperation(req), which proceeds as in Figure 1. As shown, svr_τ first determines
whether req and τ were created using different dvc secrets (line 2 of Figure 1), whether the
password mistype counter svr_τ.ctr is already at its maximum, q_{svr_τ} (line 4), and whether
req and τ were created using different passwords (line 6). If any of these conditions holds,
the request is aborted (lines 3, 5, and 8). Otherwise the request is processed according to the
req.opType field (which is a string constant, either "sign" or "delegate") and a response
is returned (lines 9-12).
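
The control flow of Figure 1 can be sketched as follows. This is only a toy model under loud assumptions: plain string comparisons stand in for the cryptographic checks fromSameDvc and fromSameUser, the `Ticket` and `Request` records are hypothetical, and the sign and delegation handlers are reduced to placeholder return values.

```python
from dataclasses import dataclass

BOTTOM = None  # stands in for the "abort" result returned in Figure 1


@dataclass
class Ticket:            # hypothetical stand-in for a decrypted ticket
    dvc_secret: str
    password: str


@dataclass
class Request:           # hypothetical stand-in for req
    dvc_secret: str
    password: str
    op_type: str         # "sign" or "delegate"
    payload: str


class Server:
    def __init__(self, ticket: Ticket, max_bad_guesses: int):
        self.ticket = ticket
        self.q = max_bad_guesses   # per-server limit on bad guesses
        self.ctr = 0               # bad password guesses seen so far

    def do_operation(self, req: Request):
        if req.dvc_secret != self.ticket.dvc_secret:   # line 2
            return BOTTOM                              # line 3
        if self.ctr == self.q:                         # line 4
            return BOTTOM                              # line 5
        if req.password != self.ticket.password:       # line 6
            self.ctr += 1                              # line 7
            return BOTTOM                              # line 8
        if req.op_type == "sign":                      # line 9
            return ("signed", req.payload)             # placeholder handler
        return ("delegated", req.payload)              # placeholder handler
```

Note that a request from a different device is rejected without touching the counter, whereas a wrong password consumes one of the allowed guesses; once the limit is reached even the correct password is refused.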

If a signature operation, dvc completes the signature for m upon receiving a valid
response from svr_τ. If a delegation to nd', the response enables dvc to generate a ticket
τ' encrypted under pk_nd'. In this case, the inner ticket ζ' is generated by svr_τ and sent to
dvc for inclusion in τ'. ζ' is used to convey secret information from svr_τ to the
yet-to-be-created svr_τ'.


4.1  Security 

We say that svr_τ is authorized at time t if either (i) τ = τ* or (ii) at some t' < t and before
dvc.comp, dvc performs dvc.delegate(svr', nd) with a svr' authorized at time t' to
obtain output (τ, authrec_τ), and no dvc.revoke(svr_τ) occurs before t. In (ii), svr' is the
consenting  server.  In  contrast  to  [15],  svr*  is  always  authorized  by  (i).  We  motivate  this 
in  Section  5. 

We  divide  attackers  into  four  nonoverlapping  classes,  based  on  what  they  compromise 
and  when.  We  assume  an  attacker  falls  into  one  of  these  classes  non-adaptively,  i.e.,  it 
does  not  change  its  behavior  relative  to  these  classes  depending  on  system  execution. 

A1. An A1 attacker does not compromise dvc.

A2. An A2 attacker compromises dvc, does not compromise π, and compromises
no server authorized at the time of dvc.comp.

A3. An A3 attacker compromises dvc, does not compromise π, and compromises
some server authorized at the time of dvc.comp.

A4. An A4 attacker compromises both dvc and π, but does not compromise any
admissible node.
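
The four classes can be restated as a small decision procedure. The function below is an illustrative sketch, not part of the original formalism; its name and boolean parameters are assumptions, and `authorized_server_comp` means "some server authorized at the time of dvc.comp is compromised".

```python
from typing import Optional


def attacker_class(dvc_comp: bool,
                   password_comp: bool,
                   authorized_server_comp: bool,
                   admissible_node_comp: bool) -> Optional[str]:
    """Map a set of compromises to A1..A4, or None if outside all classes."""
    if not dvc_comp:
        return "A1"                       # device never captured
    if not password_comp:
        # device captured, password still secret
        return "A3" if authorized_server_comp else "A2"
    if not admissible_node_comp:
        return "A4"                       # device and password, but no node
    return None                           # not covered by the four classes


print(attacker_class(True, False, False, True))   # A2
print(attacker_class(True, True, False, True))    # None
```

The classes are disjoint but not exhaustive: an attacker holding the device, the password, and an admissible node falls outside all four, which is why the sketch can return `None`.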

The  security  goals  achieved  in  [15]  against  these  attackers  are  as  follows: 

G1. An A1 attacker is unable to forge signatures for dvc.

G2. An A2 attacker can forge signatures for dvc with probability at most q/|D|,
where q is the total number of queries to authorized servers after dvc.comp,
and D is the dictionary from which the password is drawn (assumed uniformly
at random).

G3.  An  A3  attacker  can  forge  signatures  only  if  it  succeeds  in  an  offline  dictionary 
attack  on  the  password. 

G4.  An  A4  attacker  can  forge  signatures  only  until  all  admissible  nodes  are  disabled 
for  dvc. 

These properties can be more intuitively stated as follows. If an attacker does not capture
dvc (A1), then the attacker gains no ability to forge for the device (G1). On the other
extreme, if an attacker captures both dvc and π (A4), and thus is indistinguishable from
the user, but does not compromise any admissible nodes, then it can forge only until all
admissible nodes are disabled (G4). The “middle” cases are those in which the attacker
compromises dvc and not π. If it compromises dvc and no then-authorized server is ever
compromised (A2), then the attacker can do no better than an online dictionary attack
against π (G2). If, on the other hand, some server authorized when dvc is compromised is
eventually compromised (A3), then the attacker can do no better than an offline attack
against π (G3).

5  Goals 

As  motivated  in  Section  1,  our  high-level  goal  for  coordinating  capture  protection  servers 
is to improve G2 and G4 from Section 4 (while keeping G1 and G3 unchanged). First we
motivate our improvements to G2. This property bounds the probability that an A2
attacker can forge signatures for the device, as a function of the total number q of
password queries that the attacker can make to authorized servers after capturing the
device. In a straightforward implementation, each server svr would individually limit the
number of guesses to some number q_svr, and refuse to respond once svr has received q_svr
queries from dvc with the wrong password. In this case, if A is the set of authorized
servers when dvc is captured, then the number of queries that the attacker can make is
q = Σ_{svr ∈ A} q_svr. Since servers are authorized dynamically, G2 provides little assurance
without an additional mechanism to bound q, i.e., while q_svr is limited, q may not be. So,
one goal is to regain the ability to limit q explicitly:

G2+. An A2 attacker can forge signatures for dvc with probability at most q/|D|,
where  q  is  a  prespecified  constant  and  D  is  the  dictionary  from  which  the 
password  is  drawn. 
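
To make the difference between G2 and G2+ concrete, the following sketch compares the bound when each authorized server enforces only its own limit against the bound with a single global limit. The numbers (five servers, ten guesses each, a dictionary of 10,000 passwords) and the function names are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction


def forge_bound(total_guesses: int, dict_size: int) -> Fraction:
    """Upper bound q/|D| on the forging probability, capped at 1."""
    return min(Fraction(1), Fraction(total_guesses, dict_size))


def total_guesses_uncoordinated(per_server_limits) -> int:
    """q when every authorized server enforces only its own limit."""
    return sum(per_server_limits)


# Hypothetical figures, not taken from the text.
limits = [10] * 5
uncoordinated = forge_bound(total_guesses_uncoordinated(limits), 10_000)
coordinated = forge_bound(10, 10_000)   # one prespecified global q = 10

print(uncoordinated)  # 1/200
print(coordinated)    # 1/1000
```

With uncoordinated limits the bound grows with every additional authorized server, while under G2+ it stays fixed regardless of how many delegations have taken place.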

Our  second  goal  pertains  to  G4.  As  already  noted,  the  number  and  identity  of  admissible 
nodes  is  not  required  to  be  fixed,  and  it  seems  most  advantageous  for  it  to  be  specified 
more  fluidly  (e.g.,  “all  nodes  certified  by  one  of  these  three  certification  authorities”). 
Thus,  disabling  all  admissible  nodes,  as  required  in  G4,  is  a  challenge.  Even  if  the  set  of 
admissible nodes could be determined, disabling each of them may require interacting
with  potentially  hundreds  of  far-flung  nodes  all  over  the  world.  Therefore,  a  second  goal 
that  we  adopt  here  is  to  remedy  this  problem,  by  making  one  successful  disable 
operation  at  nd*  imply  that  dvc  is  disabled  at  all  admissible  nodes: 

G4+.  An  A4  attacker  can  forge  signatures  only  until  the  time  at  which  nd*  is  disabled 
for  dvc. 

6  Design


Our  strategy  for  achieving  G2+  and  G4+  is  to  maintain  a  shared  counter  ctr  for  dvc  that 
records  the  number  of  incorrect  password  guesses  made  globally  against  dvc.  A  server 
can  access  this  counter  using  three  operations:  read,  increment,  and  maximize;  see 
Figure  2.  Intuitively,  read  enables  a  server  to  read  the  current  value  of  ctr,  so  that  the 
server can refuse to interact with dvc if ctr = q, thus enforcing G2+. Upon an
unsuccessful password guess from dvc, a server will increment ctr. In addition, when
nd* is disabled for dvc it invokes maximize to set ctr to q, and then no server will
respond  to  dvc,  enforcing  G4+. 


svr_τ.read()               svr_τ.increment()                  svr_τ.maximize()
  svr_τ.retrieve()           svr_τ.retrieve()                   svr_τ.retrieve()
  (*) c ← svr_τ.ctr          (*) svr_τ.ctr ← svr_τ.ctr + 1      (*) svr_τ.ctr ← q
  V(svr_τ.sem1)              V(svr_τ.sem1)                      V(svr_τ.sem1)
  return c

Figure 2: svr_τ.read, svr_τ.increment and svr_τ.maximize algorithms; can be invoked only locally
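
The intended effect of the three operations in Figure 2 can be illustrated with a single-process stand-in. This sketch assumes one in-memory counter guarded by an ordinary lock, which replaces the distributed retrieve step and the semaphore release; the class and method names are hypothetical and the code says nothing about how the counter moves between servers.

```python
import threading


class SharedCounter:
    """Toy model of the global bad-guess counter ctr for one device."""

    def __init__(self, q: int):
        self.q = q                       # prespecified global guess limit
        self._ctr = 0
        self._lock = threading.Lock()    # stands in for retrieve() ... V(sem1)

    def read(self) -> int:
        with self._lock:
            return self._ctr

    def increment(self) -> None:
        with self._lock:
            self._ctr += 1               # one more bad password guess

    def maximize(self) -> None:
        with self._lock:
            self._ctr = self.q           # disabling: exhaust all guesses

    def may_respond(self) -> bool:
        # A server refuses to interact with the device once ctr reaches q.
        return self.read() < self.q


ctr = SharedCounter(q=3)
ctr.increment()
print(ctr.may_respond())   # True
ctr.maximize()
print(ctr.may_respond())   # False
```

Disabling is thus expressed as just another counter update, so a single maximize is observed by every server that consults the counter afterwards.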


6.1  Mutually  exclusive  access 


We  strive  to  support  concurrent  requests  for  a  counter  from  multiple  servers  to  allow 
disabling  a  compromised  device  while  the  attacker  is  using  it,  and  to  permit  maximum 
flexibility  in  legitimate  uses  of  the  device's  private  key  (e.g.,  device  cloning).  To  ensure 
the  counter's  consistency,  our  implementation  enforces  mutually  exclusive  access. 


 1. svr_τ.initialize()
 2.   svr_τ.parent ← τ.getConsentingSvr()   /* extract consenting server from ticket */
 3.   svr_τ.children ← ∅
 4.   svr_τ.sem2 ← 1                        /* so that it doesn't block to start with */
 5.   if svr_τ.parent = ∅                   /* τ = τ* */
 6.     svr_τ.arrow ← svr_τ
 7.     svr_τ.ctr ← 0                       /* initialize counter */
 8.     svr_τ.sem1 ← 1                      /* allow incoming retrieve requests */
 9.   else
10.     svr_τ.arrow ← svr_τ.parent          /* initialize arrow to point to parent */
11.     svr_τ.sem1 ← 0                      /* don't have the counter: block incoming retrieve requests */

Figure 3: svr_τ.initialize algorithm; invoked locally by the node hosting svr_τ


The  protocol  we  propose  for  ensuring  mutually  exclusive  access  consists  mainly  of  the 
svr_τ.initialize and svr_τ.retrieve functions shown in Figure 3 and Figure 4. svr_τ.initialize
is invoked by a node when τ is first submitted to it, and svr_τ.retrieve can be invoked
either by svr_τ itself or remotely by another server. In a nutshell, each authorized capture
protection server maintains a pointer, here called an arrow and denoted svr_τ.arrow, to
the server from which it received the last request for access to the counter. That is, if
svr_τ receives a request for the counter, then svr_τ requests it from svr' ← svr_τ.arrow (line
4 in Figure 4) by invoking svr'.retrieve() (line 12 or 15). It also sets svr_τ.arrow to be the
identity of the requester, denoted caller in Figure 4 (line 5). caller is authenticated by
means that will be discussed in Section 6.2, so that if caller = svr'' and svr'' is not
compromised, then svr'' performed this method invocation. Upon receiving the counter
in response to the svr'.retrieve() request, svr_τ returns the counter to caller (line 17).
Figure 5 shows the effects of a retrieve request initiated by a server svr.


 1. svr_τ.retrieve()                                      /* caller is id of invoking server */
 2.   if caller ∉ svr_τ.children ∪ {svr_τ.parent, svr_τ}  /* svr_τ.children, svr_τ.parent described in Sec. 6.2 */
 3.     return ⊥                                          /* ignore requests from unknowns */
 4.   svr' ← svr_τ.arrow                                  /* see who most recently requested the counter */
 5.   svr_τ.arrow ← caller                                /* record our caller as last requesting the counter */
 6.   if (svr' = svr_τ)                                   /* if I most recently requested the counter, then ... */
 7.     P(svr_τ.sem2)                                     /* ... block request until I complete previous ones, */
 8.     P(svr_τ.sem1)                                     /* ... block caller until I'm done with the counter, */
 9.     V(svr_τ.sem2)                                     /* ... and permit requests to make progress again */
10.   else if (caller = svr_τ)                            /* if I am the one requesting, then ... */
11.     P(svr_τ.sem2)                                     /* ... block request until I complete previous ones, */
12.     svr_τ.ctr ← svr'.retrieve()                       /* ... remote call to retrieve counter (blocks thread), */
13.     V(svr_τ.sem2)                                     /* ... and permit requests to make progress again */
14.   else                                                /* I am just a "transit server" for this request */
15.     svr_τ.ctr ← svr'.retrieve()
16.   if (caller ≠ svr_τ)
17.     return svr_τ.ctr                                  /* return counter to the remote caller */

Figure 4: svr_τ.retrieve algorithm; can be invoked locally or remotely (by another server)
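
The way arrows are reversed as a retrieve request travels toward the counter can be seen in a purely sequential simulation. The sketch below is a simplification under stated assumptions: there is no concurrency, so both semaphores are omitted; servers are ordinary Python objects rather than authenticated remote parties; and clearing the local copy of the counter when it is handed on is an addition made here for clarity.

```python
class Server:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = set()
        if parent is None:            # initial server: holds the counter
            self.arrow = self
            self.ctr = 0
        else:                         # delegated server: arrow points to parent
            self.arrow = parent
            self.ctr = None
            parent.children.add(self)

    def retrieve(self, caller):
        # Ignore requests from servers that are neither parent, child, nor self.
        if caller is not self and caller is not self.parent \
                and caller not in self.children:
            return None
        prev = self.arrow             # who most recently requested the counter
        self.arrow = caller           # caller is now the most recent requester
        if prev is not self:
            self.ctr = prev.retrieve(self)   # fetch counter along the old arrow
        if caller is not self:
            ctr, self.ctr = self.ctr, None   # hand the counter to the caller
            return ctr


root = Server("svr*")
a = Server("a", parent=root)
b = Server("b", parent=a)

b.retrieve(b)                         # b pulls the counter via a from root
print(b.ctr, a.arrow.name, root.arrow.name)   # 0 b a
```

After the call every arrow on the path points toward b, so a later request from any server is routed to the current holder without consulting a fixed home location.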


We emphasize that svr_τ.retrieve() may be invoked concurrently, e.g., by multiple remote
servers. For simplicity, the pseudocode of Figure 4 and subsequent figures assumes that
a thread of execution runs atomically (i.e., non-preemptively, without interference from
other threads in svr_τ) until completion or until it blocks either on a semaphore (line 7, 8
or 11) or due to invoking retrieve on another server (line 12 or 15). Once a running
thread blocks, another can enter a retrieve operation. We denote global variables
accessible to all threads using the “svr_τ.” prefix, e.g., svr_τ.arrow. (“svr_τ”, i.e., the
identity of this server, is also global.) Variables without this prefix, specifically svr' and
caller, are local to this thread.

Use of two different semaphores requires some explanation. svr_τ.sem1 is used to ensure
that once the counter is retrieved by svr_τ, requests to pull the counter away are blocked
until svr_τ has executed its critical section (the lines marked “(*)” in Figure 2). svr_τ.sem2
is used to block any retrieve requests made by svr_τ until svr_τ services the retrieve
requests it received previously from others. Starvation is avoided if the retrieve requests
blocked on each semaphore are serviced in first-in-first-out order per V(svr_τ.sem_i)
invocation.
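
The blocking behavior of the first semaphore can be demonstrated with Python's standard `threading.Semaphore`, where `acquire` plays the role of P and `release` the role of V. This is a minimal sketch assuming a single holder and a single remote requester; the `demo` function and the event strings are hypothetical.

```python
import threading


def P(s: threading.Semaphore) -> None:
    s.acquire()      # block while the semaphore is 0, then decrement


def V(s: threading.Semaphore) -> None:
    s.release()      # increment, waking one blocked thread if any


def demo():
    sem1 = threading.Semaphore(0)    # 0: holder is inside its critical section
    events = []

    def remote_retrieve():
        P(sem1)                      # cannot pull the counter away yet
        events.append("counter handed to remote caller")

    t = threading.Thread(target=remote_retrieve)
    t.start()
    events.append("critical section done")   # holder finishes its update
    V(sem1)                                  # now the remote request proceeds
    t.join()
    return events


print(demo())
```

One caveat: the Python documentation does not guarantee the order in which blocked threads are woken, so the first-in-first-out servicing needed for starvation freedom would require an explicitly fair queue rather than this primitive alone.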


² To remind the reader, a semaphore s is a concurrency control primitive that represents a
non-negative integer counter with two atomic operations: V(s) increments s by one; P(s)
blocks the calling thread while s = 0 and then decrements s by one [9].

[Figure 5 diagram not recoverable from the source. Panels: (a) Before retrieve begins; (b) Retrieve in transit; (c) After retrieve returns. Legend: delegation, retrieve req, counter. Labels: svr, svr.arrow.]

Figure 5: retrieve initiated by svr


6.2  Limiting  counter  access 

To  achieve  G2+  and  G4+,  it  is  necessary  that  only  uncompromised  servers  can  pass  the 
counter  once  dvc  is  captured;  otherwise  a  compromised  server  could  manipulate  the 
counter.  For  an  A4  attacker,  it  would  suffice  to  permit  only  admissible  nodes  to  pass  the 
counter,  since  admissible  nodes  are  uncompromised  by  assumption.  However,  for  an  A2 
attacker,  where  admissible  nodes  may  be  compromised,  this  simple  rule  does  not  suffice. 
Fortunately,  since  the  servers  authorized  when  an  A2  attacker  captures  dvc  are  never 
compromised  (by  the  definition  of  A2),  it  suffices  to  permit  only  authorized  servers  to 
pass  or  hold  the  counter.  Because  authorized  servers  are  hosted  only  on  admissible 
nodes,  this  is  consistent  with  the  A4  case. 

Our protocol thus restricts counter passing to occur only between authorized servers in
the A2 case. This is complicated by the fact that the set of authorized servers is dynamic,
and there is no trustworthy record of this set. This problem can be partially alleviated by
having a consenting server svrT record all the servers it has consented to authorize in a
local set svrT.children (see line 2 of Figure 4). For simplicity, we portray svrT.children
as a set of server names in our figures, though in reality a different representation is
required. Specifically, because the ticket T' resulting from a delegation to which svrT
consented is not known to svrT, svrT cannot explicitly include T' in svrT.children.
However, if svrT includes a new cryptographic key k within both svrT.children and the
inner ticket C' that it contributes as an input to the creation of T', then svrT can use k to
authenticate requests from svrT'. For reasons described in Section 6.3, svrT also must
send a preimage-resistant and collision-resistant hash of k, hk, to dvc for storage in
authrecT'.
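
As an illustration of the key-based representation of the children set, the following sketch shows one way a consenting server might record a fresh key k per delegation, derive hk, and authenticate later requests. The report does not name the primitives used, so the choice of SHA-256 for the hash and HMAC-SHA256 for request authentication, as well as the names ConsentingServer, consent_to_delegation, authenticate, and child_tag, are assumptions made only for this example.

```python
import hashlib
import hmac
import secrets


class ConsentingServer:
    """Hypothetical stand-in for svrT's bookkeeping of delegated children."""

    def __init__(self):
        self.children = set()  # one key k per child that svrT consented to

    def consent_to_delegation(self):
        """Create a fresh key k and return (k, hk).

        k would be placed in the inner ticket; hk would be sent to dvc.
        """
        k = secrets.token_bytes(32)
        self.children.add(k)
        hk = hashlib.sha256(k).digest()  # assumed hash function
        return k, hk

    def authenticate(self, message, tag):
        """Return True if tag on message was computed under some child key."""
        return any(
            hmac.compare_digest(hmac.new(k, message, hashlib.sha256).digest(), tag)
            for k in self.children
        )

    def remove_child(self, hk):
        """Drop the child identified by hk; return True if one was removed."""
        for k in list(self.children):
            if hmac.compare_digest(hashlib.sha256(k).digest(), hk):
                self.children.remove(k)
                return True
        return False


def child_tag(k, message):
    """Tag a request the way a child holding k would (assumed HMAC-SHA256)."""
    return hmac.new(k, message, hashlib.sha256).digest()


parent = ConsentingServer()
k, hk = parent.consent_to_delegation()
request = b"retrieve"
tag = child_tag(k, request)
```

The point of the design is that the parent never needs to know the child's ticket: possession of k is what identifies an authorized child, and hk lets dvc name that child later without learning k.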

To ensure that the counter is passed only between authorized servers, it is also necessary
for each svrT' ∈ svrT.children to authenticate retrieve requests from svrT. Fortunately, svrT'
can use the key k inserted into C' above (or another key) to authenticate communication from
svrT. To facilitate svrT' contacting svrT the first time, svrT's address is included within C'
and T'; svrT' assigns this address to svrT'.parent (line 2 of Figure 3). (For the ticket of
svr*, a predetermined constant 0 is inserted in place of the consenting server address.)
 1.  svrT.doOperation(req)                      /* can be invoked remotely (by some dvc) */
 2.    if (¬fromSameDvc(req, T))
 3.      return ⊥                               /* return if T and req are not from same device */
 4.    c ← svrT.read()
 5.    if (c = q ∨ c = ⊥)
 6.      return ⊥                               /* return if counter at max or read failed */
 7.    if (¬fromSameUser(req, T))
 8.      svrT.increment()                       /* record a bad password guess */
 9.      return ⊥                               /* return on bad password guess */
10.    if (req.opType = "sign")
11.      return svrT.handleSignReq(req)         /* process sign request and return result */
12.    else if (req.opType = "delegate")
13.      return svrT.handleDelReq(req)          /* process delegation request and return result */
14.    else                                     /* req.opType = "revoke" */
15.      svr' ← req.getContents()               /* extract server to be revoked */
16.      if (svr' ∉ svrT.children)
17.        return ⊥                             /* return if svr' is not a child */
18.      svrT.children ← svrT.children \ {svr'} /* remove svr' from the children set */
19.      svrT.retrieve()                        /* in case counter was pulled away after last retrieve */
20.      V(svrT.sem1)

Figure 6: New svrT.doOperation algorithm
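
To make the control flow of Figure 6 concrete, the following is a minimal executable sketch of the same dispatch logic. It is a single-process toy under stated assumptions: the device and password checks are stubbed as boolean fields on the request, retrieve and the V operation only record that they were called, the maximum Q = 3 is arbitrary, and the return values "signed", "delegated", and "revoked" are placeholders (Figure 6 shows no return value for the revoke branch).

```python
from dataclasses import dataclass, field

FAIL = None  # stands in for the failure value returned in Figure 6
Q = 3        # hypothetical maximum counter value q


@dataclass
class Request:
    op_type: str              # "sign", "delegate", or "revoke"
    same_device: bool = True  # stub for the outcome of fromSameDvc
    same_user: bool = True    # stub for the outcome of fromSameUser
    contents: str = ""        # for "revoke": identifier of the child


@dataclass
class Server:
    counter: int = 0
    children: set = field(default_factory=set)
    trace: list = field(default_factory=list)  # records retrieve and V calls

    def read(self):
        return self.counter

    def increment(self):
        self.counter = min(self.counter + 1, Q)

    def retrieve(self):
        self.trace.append("retrieve")  # stub: no remote servers in this toy

    def release(self):
        self.trace.append("V")  # stub for the V operation on line 20

    def do_operation(self, req):
        if not req.same_device:            # lines 2-3
            return FAIL
        c = self.read()                    # line 4
        if c is FAIL or c == Q:            # lines 5-6
            return FAIL
        if not req.same_user:              # lines 7-9
            self.increment()
            return FAIL
        if req.op_type == "sign":          # lines 10-11 (handler stubbed)
            return "signed"
        if req.op_type == "delegate":      # lines 12-13 (handler stubbed)
            return "delegated"
        child = req.contents               # line 15
        if child not in self.children:     # lines 16-17
            return FAIL
        self.children.remove(child)        # line 18
        self.retrieve()                    # line 19
        self.release()                     # line 20
        return "revoked"                   # placeholder return value


svr = Server(children={"child-1"})
results = [
    svr.do_operation(Request("sign")),
    svr.do_operation(Request("revoke", contents="child-1")),
    svr.do_operation(Request("revoke", contents="child-1")),
]
for _ in range(Q):
    svr.do_operation(Request("sign", same_user=False))  # bad password guesses
locked_out = svr.do_operation(Request("sign"))
```

In this run the second revoke request fails because the child has already been removed, and after Q bad guesses even a well-formed sign request is refused, matching the check on lines 5-6.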


6.3  Revocation 

In  Section  6.2,  we  described  mechanisms  by  which  servers  keep  track,  in  their  children 
and  parent  variables,  of  other  authorized  servers  (and  how  to  authenticate  them). 

Because  of  this,  we  must  extend  revocation  to  update  these  variables,  so  that  servers  do 
not  work  with  outdated  information  about  which  servers  are  authorized.  Whereas 
initially  revocation  was  an  operation  local  to  dvc  [15],  here  we  extend  it  to  include 
interaction  with  a  server  to  update  its  children  variable. 

Specifically, before revoking a server svrT', dvc informs svrT'.parent of this revocation.
This notification indicates that dvc plans to revoke not just svrT' but also the servers in
svrT'.children, their children, and so forth. The purpose of dvc revoking the entire set of
delegations derived from svrT' is to ensure that all still-authorized servers can continue to
access the counter for dvc. Doing otherwise could partition the tree of delegations,
leaving the counter inaccessible to some authorized servers. Note that svr* is
never part of this revoked component. This is required because, to achieve G4+, svr* must
be able to retrieve the counter.

During revocation of svrT', dvc informs svrT = svrT'.parent by issuing a request req (with
req.opType = "revoke") to svrT. The revoked server svrT' is identified in req by hk (the hash
of the key k) that svrT sent to dvc during the delegation protocol; see Section 6.2. This
identifier for svrT' is extracted via req.getContents() (line 15 of Figure 6). This request
induces the removal of svrT' (or rather k) from svrT.children. Also note that svrT retrieves
the counter (see line 19 of Figure 6), thereby ensuring that the counter is not lost when
svrT' is revoked. svrT retrieves the counter after removing svrT' from svrT.children so that
any subsequent requests from svrT' to retrieve the counter are rejected. After this
svrT.doOperation call, dvc deletes authrecT', i.e., performs dvc.revoke(svrT').
Moreover, it must invoke dvc.revoke(svrT'') for each svrT'' ∈ svrT'.children, their
children, and so on.
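
The cascading revocation described above amounts to removing an entire subtree of the delegation tree. The sketch below illustrates this under simplifying assumptions: the tree is given as plain dictionaries (children_of, parent_of) and dvc's authorization records as a dictionary authrecs, all of which are hypothetical representations introduced here, and the interaction with the parent server is reduced to updating its children set.

```python
def revoked_component(children_of, root):
    """Return root and every server whose delegation derives from root."""
    component, stack = [], [root]
    while stack:
        svr = stack.pop()
        component.append(svr)
        stack.extend(sorted(children_of.get(svr, ())))
    return component


def revoke(children_of, parent_of, authrecs, target):
    """Revoke target and all of its descendants; return the servers revoked."""
    if target not in parent_of:
        raise ValueError("the root server svr* is never revoked")
    # Step 1: the parent drops target from its children set, which is the
    # effect of the "revoke" request handled by the parent's doOperation.
    children_of[parent_of[target]].discard(target)
    # Step 2: dvc deletes its authorization record for each revoked server.
    component = revoked_component(children_of, target)
    for svr in component:
        authrecs.pop(svr, None)
    return component


# Example delegation tree: svr* consented to a and d; a to b and c; b to e.
children_of = {"svr*": {"a", "d"}, "a": {"b", "c"}, "b": {"e"}}
parent_of = {"a": "svr*", "d": "svr*", "b": "a", "c": "a", "e": "b"}
authrecs = {name: object() for name in ["svr*", "a", "b", "c", "d", "e"]}
revoked = revoke(children_of, parent_of, authrecs, "a")
```

Revoking a removes a, b, c, and e together, so the servers that remain authorized (svr* and d) still form a connected tree along which the counter can be passed.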


6.4  Disabling 

To achieve G4+ we require that nd* can set the counter to its maximum value q so
as to disable dvc at all admissible nodes. The revocation mechanism presented in Section
6.3 ensures that svr* can always request the counter. Hence, upon receiving a disable
request, nd* performs a svr*.maximize operation that causes servers to stop responding
to dvc. The disable algorithm also uses a capability to authenticate the disable request;
see [15].

Note that the nd.disable request can also be sent to another node nd hosting an
authorized server for dvc. However, the ability of an A4 attacker to revoke and delegate
makes it impractical to locate nodes besides nd* to disable after dvc has been
compromised. Though the attacker can perform a dvc.revoke(svr*) operation, this will
not restrict svr*'s access to the counter, due to the measures described in Section 6.3.
Hence, nd* is able to complete a disable request.


7  Implementation  in  Fleet 

In Sections 3-6, we presented our algorithms for coordinating capture protection servers,
treating each capture protection server as a non-replicated object. However, the
coordination protocols we presented, while improving some types of protection (see
Section 5), do exacerbate the effects of a benign or malicious capture-protection server
failure, in that such a failure could prevent any server from being able to retrieve the
counter and, therefore, from assisting the device in any cryptographic operations. Consider
nd*, for example: in order for property G4+ to be useful, it is necessary that a client can
disable dvc at nd*, which requires nd* to be available. More generally, if svr is down
and another server invokes svr.retrieve(), then it is possible that all subsequent
cryptographic operations by dvc, performed using any capture protection server, will
block at least until svr recovers. It is thus necessary in practice that such a protocol be
built in a way that ensures the survivability of each capture protection server.

Fortunately, the Fleet system, while being the primary beneficiary of capture protection
in this effort, also provides a utility for building such survivable services [16]. Moreover,
integration of this capture-protection infrastructure with the Fleet system is fairly
straightforward: in principle, Fleet enables survivable implementations of arbitrary
objects, of which a capture-protection server is one example. The integration did, however,
require adaptations to both the capture-protection infrastructure and Fleet, which we
describe below.
First, in our implementation, each capture protection server svr is implemented as a Fleet
object replicated across several nodes (each running the Fleet server-side software). As a
result, it is no longer possible to simply designate pksvr to be pknd for the single nd
hosting svr (see Section 3); there are now several such nodes. Adapting the protocols to
accommodate this, however, is a simple matter: we create a new public key/secret key
pair (pksvr, sksvr) for svr and deliver sksvr to each node nd participating in the
implementation of svr, encrypted under pknd, upon creation of svr.

A second consequence of replicating svr is that svr must be implemented
deterministically, since the form of replication supported in Fleet that is appropriate for
this application is one in which all replicas are deterministic (see [16]). To
support this, we modified the capture protection server implementation to replace random
choices in each method invocation with the output of a pseudorandom function keyed with (a
cryptographic hash of) sksvr and applied to the method arguments. Provided that the
pseudorandom function is secure (indistinguishable from a random function to those not
having the key), this results in no significant loss of security.
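
The derandomization step can be sketched as follows. The report does not specify which pseudorandom function or hash was used, so HMAC-SHA256 and SHA-256 are assumptions here, as are the function name derive_randomness, the length-prefixed encoding of the arguments, and the counter-based output expansion.

```python
import hashlib
import hmac


def derive_randomness(sk_svr, method, args, nbytes=32):
    """Deterministic stand-in for fresh random bytes in a method invocation.

    The PRF key is a hash of sk_svr; the PRF input is the method name and
    its arguments, so every replica holding sk_svr derives the same bytes.
    """
    key = hashlib.sha256(sk_svr).digest()
    # Length-prefix each field so that different argument lists never
    # collapse to the same byte string.
    data = b"".join(
        len(part).to_bytes(4, "big") + part
        for part in [method.encode()] + list(args)
    )
    out = b""
    block = 0
    while len(out) < nbytes:
        out += hmac.new(key, block.to_bytes(4, "big") + data, hashlib.sha256).digest()
        block += 1
    return out[:nbytes]


r1 = derive_randomness(b"replica-shared-secret", "handleSignReq", [b"request-1"])
r2 = derive_randomness(b"replica-shared-secret", "handleSignReq", [b"request-1"])
r3 = derive_randomness(b"replica-shared-secret", "handleSignReq", [b"request-2"])
```

Two replicas evaluating the same invocation obtain identical bytes (r1 and r2), while a different argument yields unrelated output (r3). A consequence of this approach is that repeating an invocation with identical arguments also repeats the derived randomness.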

Finally,  we  were  required  to  modify  Fleet  itself  to  support  this  application,  since  when 
one  (replicated)  capture  protection  server  calls  another  (e.g.,  to  retrieve  the  counter),  this 
involves  a  replicated  object  invoking  a  method  on  another  replicated  object,  which  was 
not  transparently  supported  in  previous  versions  of  Fleet  [16].  To  accomplish  this,  we 
adapted  the  method  invocation  protocol  from  [7]  to  better  support  replicated  clients. 

Aside  from  these  adaptations,  the  implementation  of  capture  protection  in  Fleet  closely 
reflects  the  description  in  Section  6. 


8  Summary 

In  this  report  we  detailed  a  client  capture  protection  infrastructure  for  use  within  the 
context  of  Fleet,  a  survivable  object  store.  The  innovation  in  this  work  is  a  simple  data- 
sharing  protocol,  for  capture-protection  servers,  that  strictly  limits  online  dictionary 
attacks  on  a  client  device  that  is  captured,  and  that  achieves  immediate  disabling  of  the 
client  device  even  with  dynamically  changing  server  populations. 

The capture-protection infrastructure described in this report forms a symbiotic
relationship with Fleet. On one hand, the capture-protection infrastructure substantially
hardens Fleet against an important class of attack, in which a user-driven client device
with authority to invoke methods on Fleet objects is captured. This
infrastructure prevents or substantially limits the damage that such an attacker can inflict.
On the other hand, the availability of the algorithms in this infrastructure, and thus the
availability of the client device's private key operations, is particularly vulnerable to
even the benign failure of a capture-protection server. Implementing each capture-
protection server as a survivable Fleet object is therefore central to the client's availability.